{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1 Designing Your Own Sentence Generator"
   ]
  },
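  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sentence generation is a classic problem. When Turing proposed a test for machine intelligence in 1950, the criterion was whether a human could converse fluently with a computer, and a precondition for such a conversation is that the computer can generate language.\n",
    "\n",
    "How a computer can generate language is a classic but complicated question. This course introduces a rule-based generation method. Although the method was proposed early, it is still useful in many places today. Many practical algorithms in use now were proposed long ago: binary search dates from the 1940s and Dijkstra's algorithm from the 1950s, for example."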
   ]
  },
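  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1 Approach\n",
    "1. Define the grammar rules\n",
    "2. Split each rule and store it in a dict (key: rule name, value: list of alternative expansions)\n",
    "3. Generate sentences by recursive expansion\n"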
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2 Grammar 1: Douban movie review ratings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This grammar raises RecursionError: 年, 月 and 日 each refer to themselves\n",
    "# (e.g. 年 = 数字 年), so their expansion never reaches a terminal\n",
    "opinion2 = \"\"\"\n",
    "opinion = 时间 地点 影院 电影名称 副词 形容词\n",
    "时间 = 年 月 日\n",
    "年 = 数字 年\n",
    "月 = 数字 月\n",
    "日 = 数字 日\n",
    "数字 = 1 | 2 | 3 | 4\n",
    "地点 = 南京 | 上海 | 广东\n",
    "影院 = 卢米埃影城 | 幸福蓝海国际影城 | 时代华纳影城\n",
    "电影名称 = 湄公河行动 | 战狼 | 唐顿庄园 | 哈利波特\n",
    "副词 = 很 | 颇 | 极 | 十分\n",
    "形容词 = 燃 | 好看 | 难看 \n",
    "\"\"\"\n",
    "\n",
    "opinion = \"\"\"\n",
    "opinion = 电影名称 标点符号 副词 形容词 标点符号 评分 star \n",
    "电影名称 = 湄公河行动 | 战狼 | 唐顿庄园 | 哈利波特\n",
    "副词 = 很 | 颇 | 极 | 十分\n",
    "形容词 = 燃 | 还不错 | 难看 \n",
    "标点符号 = 。| !\n",
    "star = 1 | 2 | 3 | 4 \n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1. Parse the grammar string and store the rules in a dict\n",
    "def grammar_process(grammar_str, split='='):\n",
    "    grammar_dict = {}\n",
    "    for sentences in grammar_str.split('\\n'):\n",
    "        if not sentences:\n",
    "            continue\n",
    "        sen, formula = sentences.split(split)\n",
    "#         print(sen, formula)\n",
    "        formulas = formula.split('|')\n",
    "#         print(formulas)\n",
    "        formulas = [f.split() for f in formulas]\n",
    "#         print(formulas)\n",
    "        grammar_dict[sen.strip()] = formulas\n",
    "        print(sen,\":\",formulas)\n",
    "    return grammar_dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "opinion  : [['电影名称', '标点符号', '副词', '形容词', '标点符号', '评分', 'star']]\n",
      "电影名称  : [['湄公河行动'], ['战狼'], ['唐顿庄园'], ['哈利波特']]\n",
      "副词  : [['很'], ['颇'], ['极'], ['十分']]\n",
      "形容词  : [['燃'], ['还不错'], ['难看']]\n",
      "标点符号  : [['。'], ['!']]\n",
      "star  : [['1'], ['2'], ['3'], ['4']]\n"
     ]
    }
   ],
   "source": [
    "grammar_dict = grammar_process(opinion)"
   ]
  },
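  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 2. Generate a sentence by recursive expansion\n",
    "import random\n",
    "import sys\n",
    "\n",
    "choice_a_exp = random.choice\n",
    "# Raise the recursion limit, here to one million. This only postpones the error\n",
    "# for a self-referential grammar and is not needed for the grammar used above.\n",
    "sys.setrecursionlimit(1000000)\n",
    "\n",
    "def sen_generate(grammar_dict: dict, target: str):\n",
    "    # A token without a rule is a terminal and is returned as-is\n",
    "    if target not in grammar_dict:\n",
    "        return target\n",
    "    # Pick one alternative at random and expand each of its tokens\n",
    "    expr = choice_a_exp(grammar_dict[target])\n",
    "    return ''.join(sen_generate(grammar_dict, t) for t in expr)"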
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "唐顿庄园!极难看。评分4\n",
      "湄公河行动!很难看!评分2\n",
      "湄公河行动。很还不错。评分3\n",
      "唐顿庄园。颇还不错!评分4\n",
      "唐顿庄园!十分还不错!评分1\n",
      "战狼!颇燃!评分4\n",
      "唐顿庄园。极还不错!评分2\n",
      "唐顿庄园!极难看!评分4\n",
      "哈利波特。很燃。评分2\n",
      "哈利波特!很还不错。评分1\n"
     ]
    }
   ],
   "source": [
    "for i in range(10):\n",
    "    print(sen_generate(grammar_dict, \"opinion\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### * Problems\n",
    "1. Recursion depth\n",
    "\n",
    "```\n",
    "---------------------------------------------------------------------------\n",
    "RecursionError                            Traceback (most recent call last)\n",
    "<ipython-input-136-0ca81b7e2880> in <module>\n",
    "----> 1 sen_generate(grammar_dict, \"opinion\")\n",
    "\n",
    "<ipython-input-135-40615388b3c0> in sen_generate(grammar_dict, target)\n",
    "      7         return target\n",
    "      8     expr = choice_a_exp(grammar_dict[target])\n",
    "----> 9     return ' '.join(sen_generate(grammar_dict, t) for t in expr)\n",
    "     10 \n",
    "\n",
    "<ipython-input-135-40615388b3c0> in <genexpr>(.0)\n",
    "      7         return target\n",
    "      8     expr = choice_a_exp(grammar_dict[target])\n",
    "----> 9     return ' '.join(sen_generate(grammar_dict, t) for t in expr)\n",
    "     10 \n",
    "\n",
    "... last 2 frames repeated, from the frame below ...\n",
    "\n",
    "<ipython-input-135-40615388b3c0> in sen_generate(grammar_dict, target)\n",
    "      7         return target\n",
    "      8     expr = choice_a_exp(grammar_dict[target])\n",
    "----> 9     return ' '.join(sen_generate(grammar_dict, t) for t in expr)\n",
    "     10 \n",
    "\n",
    "RecursionError: maximum recursion depth exceeded while calling a Python object\n",
    "```\n",
    "Cause: the recursion limit was exceeded. In the grammar that triggers this error, the rules `年`, `月` and `日` refer to themselves (for example `年 = 数字 年`), so expansion never reaches a terminal.\n",
    "\n",
    "Workaround: raising the limit as shown here only postpones the error for such a grammar, and a very large limit can crash the interpreter. The reliable fix is to remove the self-reference from the grammar.\n",
    "```\n",
    "import sys\n",
    "sys.setrecursionlimit(1000000)  # for example, one million\n",
    "```\n",
    "2. Overly complex grammar\n",
    "\n",
    "When the grammar is too complex, the server can crash, so the code needs to be optimized."
   ]
  },
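  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The recursion error above comes from rules that can never reach a terminal. Below is a minimal sketch of a check for such rules before generating, assuming the dict layout produced by `grammar_process` (rule name mapped to a list of alternatives, each a list of tokens). The function name `unproductive_rules` and the small `bad_grammar` dict are hypothetical and only illustrate the idea.\n",
    "\n",
    "```python\n",
    "def unproductive_rules(grammar_dict):\n",
    "    # Return the rule names that can never expand to a finite sentence.\n",
    "    productive = set()\n",
    "    changed = True\n",
    "    while changed:\n",
    "        changed = False\n",
    "        for name, alternatives in grammar_dict.items():\n",
    "            if name in productive:\n",
    "                continue\n",
    "            # A rule is productive if some alternative uses only terminals\n",
    "            # (tokens without a rule) or rules already known to be productive.\n",
    "            for alt in alternatives:\n",
    "                if all(tok not in grammar_dict or tok in productive for tok in alt):\n",
    "                    productive.add(name)\n",
    "                    changed = True\n",
    "                    break\n",
    "    return sorted(set(grammar_dict) - productive)\n",
    "\n",
    "\n",
    "# Hypothetical excerpt of a grammar whose date rules refer to themselves\n",
    "bad_grammar = {\n",
    "    '时间': [['年', '月', '日']],\n",
    "    '年': [['数字', '年']],\n",
    "    '月': [['数字', '月']],\n",
    "    '日': [['数字', '日']],\n",
    "    '数字': [['1'], ['2'], ['3'], ['4']],\n",
    "}\n",
    "print(unproductive_rules(bad_grammar))\n",
    "```\n",
    "\n",
    "Any rule reported by this check would recurse forever in `sen_generate`, whatever the recursion limit. A rule that also offers a terminating alternative is not reported, although random expansion of it can still become deep."
   ]
  },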
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3 Grammar 2: zodiac signs and personality"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# |: or; &: and; ~: range\n",
    "constellation = \"\"\"\n",
    "constellation =  生日: YY 月 XX 日, 对应, 星座性格\n",
    "YY = 1~12  \n",
    "XX = 1~30\n",
    "星座性格 = 水瓶座&智慧 | 双鱼座&浪漫  | 白羊座&直率 | 金牛座&可靠 | 双子座&机智 | 巨蟹座&真挚 | 狮子座&热心 | 处女座&保守 | 天秤座&和谐 | 天蝎座&狂妄 | 射手座&活泼 | 魔蝎座&原则\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Split each item on the range operator and expand it, e.g. '1~3' -> [['1'], ['2'], ['3']]\n",
    "def test_process1(l, op='~'):\n",
    "    ll = []\n",
    "    for i in l:\n",
    "        start, end = i.split(op)\n",
    "        # Expand inside the loop so every item contributes, not only the last one\n",
    "        for n in range(int(start), int(end) + 1):\n",
    "            ll.append([str(n)])\n",
    "    return ll"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Replace the operator with a label, e.g. '水瓶座&智慧' -> '水瓶座性格特点:智慧'\n",
    "def test_process2(l, op='&', rep=\"性格特点:\"):\n",
    "#     l = ['水瓶座&智慧']\n",
    "    lll = []\n",
    "    for i in l:\n",
    "        lll.append(i.replace(op, rep))\n",
    "    ll = [t.split() for t in lll]\n",
    "    return ll"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['水瓶座性格特点:智慧'],\n",
       " ['双鱼座性格特点:浪漫'],\n",
       " ['白羊座性格特点:直率'],\n",
       " ['金牛座性格特点:可靠'],\n",
       " ['双子座性格特点:机智'],\n",
       " ['巨蟹座性格特点:真挚'],\n",
       " ['狮子座性格特点:热心'],\n",
       " ['处女座性格特点:保守'],\n",
       " ['天秤座性格特点:和谐'],\n",
       " ['天蝎座性格特点:狂妄'],\n",
       " ['射手座性格特点:活泼'],\n",
       " ['魔蝎座性格特点:原则']]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_process2([' 水瓶座&智慧 ', ' 双鱼座&浪漫  ', ' 白羊座&直率 ', ' 金牛座&可靠 ', ' 双子座&机智 ', ' 巨蟹座&真挚 ', ' 狮子座&热心 ', ' 处女座&保守 ', ' 天秤座&和谐 ', ' 天蝎座&狂妄 ', ' 射手座&活泼 ', ' 魔蝎座&原则'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1. Parse the zodiac grammar into a dict: birth date -> zodiac sign -> personality\n",
    "def grammarC_process(grammar_str, split='='):\n",
    "    grammar_dict = {}\n",
    "    for sentences in grammar_str.split('\\n'):\n",
    "        if not sentences:\n",
    "            continue\n",
    "        sen, formula = sentences.split(split)\n",
    "        formulas = formula.split('|')\n",
    "        if any('~' in f for f in formulas):\n",
    "            # Range rule such as 1~12: expand into one alternative per number\n",
    "            formulas = test_process1(formulas)\n",
    "        elif any('&' in f for f in formulas):\n",
    "            # Pair rule such as 水瓶座&智慧: replace & with a label\n",
    "            formulas = test_process2(formulas)\n",
    "        else:\n",
    "            formulas = [f.split() for f in formulas]\n",
    "        grammar_dict[sen.strip()] = formulas\n",
    "    return grammar_dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "grammar_dict = grammarC_process(constellation)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 2. Generate a sentence by recursive expansion\n",
    "import random\n",
    "import sys\n",
    "\n",
    "choice_a_exp = random.choice\n",
    "sys.setrecursionlimit(1000000)  # for example, one million\n",
    "\n",
    "def senC_generate(grammar_dict: dict, target: str):\n",
    "    if target not in grammar_dict:\n",
    "        return target\n",
    "    expr = choice_a_exp(grammar_dict[target])\n",
    "    # Recurse into senC_generate itself rather than sen_generate from section 1.2\n",
    "    return ''.join(senC_generate(grammar_dict, t) for t in expr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "生日:11月17日,对应,狮子座性格特点:热心\n",
      "生日:12月24日,对应,水瓶座性格特点:智慧\n",
      "生日:7月6日,对应,射手座性格特点:活泼\n",
      "生日:3月17日,对应,巨蟹座性格特点:真挚\n",
      "生日:3月11日,对应,白羊座性格特点:直率\n",
      "生日:6月6日,对应,金牛座性格特点:可靠\n",
      "生日:7月2日,对应,射手座性格特点:活泼\n",
      "生日:8月26日,对应,天秤座性格特点:和谐\n",
      "生日:10月1日,对应,双鱼座性格特点:浪漫\n",
      "生日:8月4日,对应,狮子座性格特点:热心\n",
      "生日:8月2日,对应,天蝎座性格特点:狂妄\n",
      "生日:10月28日,对应,巨蟹座性格特点:真挚\n",
      "生日:6月10日,对应,白羊座性格特点:直率\n",
      "生日:7月28日,对应,金牛座性格特点:可靠\n",
      "生日:6月11日,对应,魔蝎座性格特点:原则\n",
      "生日:1月6日,对应,巨蟹座性格特点:真挚\n",
      "生日:12月10日,对应,处女座性格特点:保守\n",
      "生日:11月14日,对应,处女座性格特点:保守\n",
      "生日:5月30日,对应,狮子座性格特点:热心\n",
      "生日:3月14日,对应,巨蟹座性格特点:真挚\n"
     ]
    }
   ],
   "source": [
    "for i in range(20):\n",
    "    print(senC_generate(grammar_dict, \"constellation\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2 Training a Language Model on New Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1 Data preprocessing\n",
    "- Extract the data from the table\n",
    "- Preprocess the data and remove noise\n",
    "- Word segmentation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "21"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random.choice(range(100))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "filename = 'data/movie_comments.csv'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/stone/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3049: DtypeWarning: Columns (0,4) have mixed types. Specify dtype option on import or set low_memory=False.\n",
      "  interactivity=interactivity, compiler=compiler, result=result)\n"
     ]
    }
   ],
   "source": [
    "content = pd.read_csv(filename, encoding='utf-8')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 163,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>link</th>\n",
       "      <th>name</th>\n",
       "      <th>comment</th>\n",
       "      <th>star</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>https://movie.douban.com/subject/26363254/</td>\n",
       "      <td>战狼2</td>\n",
       "      <td>吴京意淫到了脑残的地步，看了恶心想吐</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>https://movie.douban.com/subject/26363254/</td>\n",
       "      <td>战狼2</td>\n",
       "      <td>首映礼看的。太恐怖了这个电影，不讲道理的，完全就是吴京在实现他这个小粉红的英雄梦。各种装备轮...</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>https://movie.douban.com/subject/26363254/</td>\n",
       "      <td>战狼2</td>\n",
       "      <td>吴京的炒作水平不输冯小刚，但小刚至少不会用主旋律来炒作…吴京让人看了不舒服，为了主旋律而主旋...</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>https://movie.douban.com/subject/26363254/</td>\n",
       "      <td>战狼2</td>\n",
       "      <td>凭良心说，好看到不像《战狼1》的续集，完虐《湄公河行动》。</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>https://movie.douban.com/subject/26363254/</td>\n",
       "      <td>战狼2</td>\n",
       "      <td>中二得很</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  id                                        link name  \\\n",
       "0  1  https://movie.douban.com/subject/26363254/  战狼2   \n",
       "1  2  https://movie.douban.com/subject/26363254/  战狼2   \n",
       "2  3  https://movie.douban.com/subject/26363254/  战狼2   \n",
       "3  4  https://movie.douban.com/subject/26363254/  战狼2   \n",
       "4  5  https://movie.douban.com/subject/26363254/  战狼2   \n",
       "\n",
       "                                             comment star  \n",
       "0                                 吴京意淫到了脑残的地步，看了恶心想吐    1  \n",
       "1  首映礼看的。太恐怖了这个电影，不讲道理的，完全就是吴京在实现他这个小粉红的英雄梦。各种装备轮...    2  \n",
       "2  吴京的炒作水平不输冯小刚，但小刚至少不会用主旋律来炒作…吴京让人看了不舒服，为了主旋律而主旋...    2  \n",
       "3                      凭良心说，好看到不像《战狼1》的续集，完虐《湄公河行动》。    4  \n",
       "4                                               中二得很    1  "
      ]
     },
     "execution_count": 163,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "content.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "articles = content['comment'].tolist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "261497"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(articles)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "吴京意淫到了脑残的地步，看了恶心想吐\n",
      "首映礼看的。太恐怖了这个电影，不讲道理的，完全就是吴京在实现他这个小粉红的英雄梦。各种装备轮番上场，视物理逻辑于不顾，不得不说有钱真好，随意胡闹\n",
      "吴京的炒作水平不输冯小刚，但小刚至少不会用主旋律来炒作…吴京让人看了不舒服，为了主旋律而主旋律，为了煽情而煽情，让人觉得他是个大做作、大谎言家。（7.29更新）片子整体不如湄公河行动，1.整体不够流畅，编剧有毒，台词尴尬；2.刻意做作的主旋律煽情显得如此不合时宜而又多余。\n",
      "凭良心说，好看到不像《战狼1》的续集，完虐《湄公河行动》。\n",
      "中二得很\n",
      "“犯我中华者，虽远必诛”，吴京比这句话还要意淫一百倍。\n",
      "脑子是个好东西，希望编剧们都能有。\n",
      "三星半，实打实的7分。第一集在爱国主旋律内部做着各种置换与较劲，但第二集才真正显露吴京的野心，他终于抛弃李忠志了，新增外来班底让硬件实力有机会和国际接轨，开篇水下长镜头和诸如铁丝网拦截RPG弹头的细节设计都让国产动作片重新封顶，在理念上，它甚至做到《绣春刀2》最想做到的那部分。\n",
      "开篇长镜头惊险大气引人入胜 结合了水平不俗的快剪下实打实的真刀真枪 让人不禁热血沸腾 特别弹簧床架挡炸弹 空手接碎玻璃 弹匣割喉等帅得飞起！就算前半段铺垫节奏散漫主角光环开太大等也不怕 作为一个中国人 两个小时弥漫着中国强大得不可侵犯的氛围 还是让那颗民族自豪心砰砰砰跳个不停。\n",
      "15/100吴京的冷峰在这部里即像成龙，又像杰森斯坦森，但体制外的同类型电影，主角总是代表个人，无能的政府需要求助于这些英雄才能解决难题，体现的是个人的价值，所以主旋律照抄这种模式实际上是有问题的。我们以前嘲笑个人英雄主义，却没想到捆绑爱国主义的全能战士更加难以下咽。\n",
      "犯我中华者虽远必诛，是有多无脑才信这句话。\n",
      "这部戏让人看的热血沸腾，对吴京路转粉，最后的彩蛋，让我们没有理由不期待下一部。\n",
      "假嗨，特别恶心的电影。\n",
      "有几处情节设置过于尴尬，彰显国家自豪感的部分稍显突兀。\n",
      "就是一部爽片，打戏挺燃，但是故事一般。达康书记不合适这个角色，赵东来倒是很合适。张瀚太太太违和了，分分钟穿越回偶像剧。\n",
      "赵东来：达康书记，我们接到在非洲卧底的冷锋报告，丁义珍现在非洲，我们请求抓捕。李达康：东来，这件事先不要声张，特别是别让省厅知道，就你和我一起去非洲，加上冷锋同志，三人逮捕丁义珍。这次行就叫战狼2吧\n",
      "下一部拍喜剧吧，整个片子真感觉挺搞笑的\n",
      "《战狼2》里吴京这么能打，他打得过徐晓冬么？\n",
      "心往一处想，劲往一处使，就能实现我们的梦想。看吧，比第一部好太多了。谢谢美队的动作指导。\n",
      "这都能火。是我没见识！\n",
      "开头的水下长对决戏可算华语电影的顶尖存在；驱逐舰、导弹和坦克在商业片里这么狂用也是了得；镜头运用和笑点插入都很好莱坞爆米花，不功不过；从头打到尾是真拼，虽然镜头也有略乱时；因为没啥期望值，所以被吴京的野心吓了一跳；吴刚、于谦和丁海峰老三位像炖烂熟的牛筋，嚼着就舒服。\n",
      "很用心啊吴京导演，小看你了，确实在导演上下功夫了拉片子了，知道借鉴是好的。至于大家比较反感的小粉红情绪我觉得那些桥段都是主旋律必备啊是稍微有一点过但还可以接受。最好的地方是吴京节奏掌握得很好，知道张弛有度，这点很难得。\n",
      "犯我中华者虽远必诛，这句话一直在我脑子里回响\n",
      "片头海里那场动作戏看完就呆不下去了，太假太做作，提前离场。\n",
      "好看，这部戏让人看的热血沸腾，打戏挺燃的，吴京演技棒呆了\n",
      "符合“有钱了续集反而拍更差”这一放之四海而皆准的规律，场面越做越大，然而伴随着各种动作场面和特效场面的升级，这一部的叙事反而变得非常凌乱。格局颇大，想拍成《黑鹰坠落》，结果撑死最多也只是官方主旋律版的《敢死队》。吴京确实有野心，但论自我角色定位能力远不如同是动作演员出身的甄子丹。\n",
      "说喜欢这部片子的人不是装傻就是真傻，要不是真的没有别的可看肯定是不会选这部的，直男癌到令人发指，所有剧情走向也完全是九十年代那套照搬，审美这件事儿真不是一时半会儿能培养出来的。\n",
      "整部电影延续1的风格，热血。场面比1来的要大，打戏动作不错，吴京挺适合演军人的，电影之前的中国梦片段他都念的劲儿劲儿的。整体来说还不错，不过张翰太违和了，一出来就一股雷阵雨的画风。\n",
      "目瞪狗呆！太瘠薄好看了！中国人牛b就是硬道理！隔壁建军大爷都没你们爱国\n",
      "《战狼2》的动作场景和战斗装备全线升级，热血的打斗动作从头打到尾。《战狼2》游走在电影审查红线的边界和政治安全的缝隙，是部延续了第一部极具煽动爱国情绪的国产动作大片。如此制作精良的影片，还请多来一点。\n",
      "电影用的胶卷挺差的，故事过度也差，地方部队还没太多展示就死去，反正各种问题。但就是能吸引人看下去，就冲这，为什么要这么鄙视敢想敢去开拓的人，不允许他们再去拍，直到能有更好的人，拍出更棒的更出彩的电影来呢？\n",
      "火爆的场面拍出了好莱坞大片的感觉，本片必将燃爆暑期\n",
      "吴京厉害了，身为武打演员，能拍到这么高标准的大场面的枪战戏，为你点赞。热血男儿，荷尔蒙爆发！\n",
      "能给0星么，好恶心啊\n",
      "《血战钢锯岭》中国人也会觉得好看，因为它歌颂的宗教情怀是超越政权的；但当你只想歌颂一个政权时，很明显就低了一个层次，甚至充满了现实乃至投机的考量，高下立见\n",
      "请问吴京脑残，弹簧床能挡火箭炮吗？\n",
      "上一部是傲气雄鹰，这一部是第一滴血4。吴京算是国内导演对类型片感觉比较准的，作为动作片钱都花在有效地方，整体火爆流畅，有大片气魄，创作上也足够真诚。人物设计也都不错，连张翰都很可爱了。如果吴京不像当年甄子丹那样一时膨胀、在银幕上独占聚光灯，肯定可以走得更远。\n",
      "扪心自问这种电影真没法评价，全片靠动作戏撑完，文戏都是扯淡，女主角毫无存在的必要，故事不需要逻辑只要主角开挂，但牛逼之处在于全片都透露着极强烈的爱国主义光环和意识形态枷锁，在祖国面前，一切反动派都是纸老虎，所以战狼一个人开挂团灭一个连都是合情合理的，动作戏还不错，挺用心，两星鼓励\n",
      "扪心自问这种电影真没法评价，全片靠动作戏撑完，文戏都是扯淡，女主角毫无存在的必要，故事不需要逻辑只要主角开挂，但牛逼之处在于全片都透露着极强烈的爱国主义光环和意识形态枷锁，在祖国面前，一切反动派都是纸老虎，所以战狼一个人开挂团灭一个连都是合情合理的，动作戏还不错，挺用心，两星鼓励\n",
      "两星给打戏，其他一般般，没啥看点，还有点尴尬？\n",
      "太尴尬了！！！！手接炸弹！！哈哈哈！！！从张翰出来之后，我就想炸了他！！\n",
      "翻了一下我给第一部的评价是四星，当时觉得挺燃的，这部其实在完成度上更接近好莱坞的制作了，每个步骤每个人物的走向都很顺滑，没有任何出人意料的地方。只给三星是因为，看看最近现实世界的一切，抱歉我在影院里燃不起来，只是觉得一切都很魔幻，当然开头的强拆是最有现实感的一幕了。\n",
      "太喜欢《战狼2》开场6分钟长镜头的水下搏斗戏了！从来没有在其它任何一部电影里看到过，因为拍摄难度真的不一般，同时还对演员有各种技能方面的要求。看完片子回来搜了下，被吴京会游泳、潜水、滑雪、开飞机、开坦克、射击等各项技能，还特意去特种部队当过18个月兵…真的很佩服这样的电影人！\n",
      "3星半。1.电影结束有掌声出现，近期少见。2.一粒爱国主义大补丸，有人吃的开心，有人觉得补大了。3.从头打到尾，从白打到黑。4.从片头字幕到影片细节，完全展现了吴京作为一个超级直男的糙和猛。主角光环媲美终结者。5.达康书记无亮点，张翰变谐星。6.3D？？？7.导演的掌控能力逼近Hold不住的边缘。\n",
      "打戏非常带感，燃爆了，拳拳到肉，看得超爽！\n",
      "吴京确实很聪明，很鸡贼。在一面大旗下呈现了一出重工业娱乐电影。他一直调控着说教和娱乐的比例，娱乐多了，尺度不被允许，说教多了，大众不接纳，比例把握非常微妙。这其中还是有一些“奇侠”化的内容，比如用玻璃碴子当飞镖杀敌一类，只不过被遮盖掉了。“老爹”演过美剧《搏击王国》，力荐那部美剧\n",
      "作为主旋律影片为啥用《奇异恩典》配乐，画内镜头还是中国军人……\n",
      "男生看这部电影的话，应该会很喜欢，因为很刺激肾上腺素，如果是女生，冷锋对龙小云的感情也会十分打动你，真的！\n",
      "无脑动作片，模仿许多好莱坞大场面再想怎么玩怎么玩一股脑堆，槽点多到炸，几位主角血厚到科幻级别，吴京重复演满血，红血，中毒，极速回血，爆种打通全场...确实很拼但片子太过投机取巧，炸穿银幕连迈克尔贝都不受待见了，国片还前仆后继炸不停，故事不好看堆再多大场面大爆炸假high瞎燃也没用。5/10\n",
      "吴京：这种女人就缺我这样的男人征服。心往一处想，劲往一处使，就能实现吴京直男癌的中国梦🇨🇳\n"
     ]
    }
   ],
   "source": [
    "for i in range(50):\n",
    "    print(articles[i])"
   ]
  },
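  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "def token(string):\n",
    "    # Keep runs of word characters (CJK characters, letters, digits, underscore); punctuation is dropped\n",
    "    return re.findall(r'\\w+', string)"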
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['吴京意淫到了脑残的地步', '看了恶心想吐']\n",
      "['首映礼看的', '太恐怖了这个电影', '不讲道理的', '完全就是吴京在实现他这个小粉红的英雄梦', '各种装备轮番上场', '视物理逻辑于不顾', '不得不说有钱真好', '随意胡闹']\n",
      "['吴京的炒作水平不输冯小刚', '但小刚至少不会用主旋律来炒作', '吴京让人看了不舒服', '为了主旋律而主旋律', '为了煽情而煽情', '让人觉得他是个大做作', '大谎言家', '7', '29更新', '片子整体不如湄公河行动', '1', '整体不够流畅', '编剧有毒', '台词尴尬', '2', '刻意做作的主旋律煽情显得如此不合时宜而又多余']\n",
      "['凭良心说', '好看到不像', '战狼1', '的续集', '完虐', '湄公河行动']\n",
      "['中二得很']\n",
      "['犯我中华者', '虽远必诛', '吴京比这句话还要意淫一百倍']\n",
      "['脑子是个好东西', '希望编剧们都能有']\n",
      "['三星半', '实打实的7分', '第一集在爱国主旋律内部做着各种置换与较劲', '但第二集才真正显露吴京的野心', '他终于抛弃李忠志了', '新增外来班底让硬件实力有机会和国际接轨', '开篇水下长镜头和诸如铁丝网拦截RPG弹头的细节设计都让国产动作片重新封顶', '在理念上', '它甚至做到', '绣春刀2', '最想做到的那部分']\n",
      "['开篇长镜头惊险大气引人入胜', '结合了水平不俗的快剪下实打实的真刀真枪', '让人不禁热血沸腾', '特别弹簧床架挡炸弹', '空手接碎玻璃', '弹匣割喉等帅得飞起', '就算前半段铺垫节奏散漫主角光环开太大等也不怕', '作为一个中国人', '两个小时弥漫着中国强大得不可侵犯的氛围', '还是让那颗民族自豪心砰砰砰跳个不停']\n",
      "['15', '100吴京的冷峰在这部里即像成龙', '又像杰森斯坦森', '但体制外的同类型电影', '主角总是代表个人', '无能的政府需要求助于这些英雄才能解决难题', '体现的是个人的价值', '所以主旋律照抄这种模式实际上是有问题的', '我们以前嘲笑个人英雄主义', '却没想到捆绑爱国主义的全能战士更加难以下咽']\n",
      "['犯我中华者虽远必诛', '是有多无脑才信这句话']\n",
      "['这部戏让人看的热血沸腾', '对吴京路转粉', '最后的彩蛋', '让我们没有理由不期待下一部']\n",
      "['假嗨', '特别恶心的电影']\n",
      "['有几处情节设置过于尴尬', '彰显国家自豪感的部分稍显突兀']\n",
      "['就是一部爽片', '打戏挺燃', '但是故事一般', '达康书记不合适这个角色', '赵东来倒是很合适', '张瀚太太太违和了', '分分钟穿越回偶像剧']\n",
      "['赵东来', '达康书记', '我们接到在非洲卧底的冷锋报告', '丁义珍现在非洲', '我们请求抓捕', '李达康', '东来', '这件事先不要声张', '特别是别让省厅知道', '就你和我一起去非洲', '加上冷锋同志', '三人逮捕丁义珍', '这次行就叫战狼2吧']\n",
      "['下一部拍喜剧吧', '整个片子真感觉挺搞笑的']\n",
      "['战狼2', '里吴京这么能打', '他打得过徐晓冬么']\n",
      "['心往一处想', '劲往一处使', '就能实现我们的梦想', '看吧', '比第一部好太多了', '谢谢美队的动作指导']\n",
      "['这都能火', '是我没见识']\n"
     ]
    }
   ],
   "source": [
    "for i in range(20):\n",
    "    print(token(articles[i]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2 Word segmentation and building the language model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Strip punctuation from every comment; jieba is imported for the word segmentation below\n",
    "import jieba\n",
    "\n",
    "articles_clean = [''.join(token(str(a))) for a in articles]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['吴京意淫到了脑残的地步看了恶心想吐', '首映礼看的太恐怖了这个电影不讲道理的完全就是吴京在实现他这个小粉红的英雄梦各种装备轮番上场视物理逻辑于不顾不得不说有钱真好随意胡闹', '吴京的炒作水平不输冯小刚但小刚至少不会用主旋律来炒作吴京让人看了不舒服为了主旋律而主旋律为了煽情而煽情让人觉得他是个大做作大谎言家729更新片子整体不如湄公河行动1整体不够流畅编剧有毒台词尴尬2刻意做作的主旋律煽情显得如此不合时宜而又多余', '凭良心说好看到不像战狼1的续集完虐湄公河行动', '中二得很', '犯我中华者虽远必诛吴京比这句话还要意淫一百倍', '脑子是个好东西希望编剧们都能有', '三星半实打实的7分第一集在爱国主旋律内部做着各种置换与较劲但第二集才真正显露吴京的野心他终于抛弃李忠志了新增外来班底让硬件实力有机会和国际接轨开篇水下长镜头和诸如铁丝网拦截RPG弹头的细节设计都让国产动作片重新封顶在理念上它甚至做到绣春刀2最想做到的那部分', '开篇长镜头惊险大气引人入胜结合了水平不俗的快剪下实打实的真刀真枪让人不禁热血沸腾特别弹簧床架挡炸弹空手接碎玻璃弹匣割喉等帅得飞起就算前半段铺垫节奏散漫主角光环开太大等也不怕作为一个中国人两个小时弥漫着中国强大得不可侵犯的氛围还是让那颗民族自豪心砰砰砰跳个不停', '15100吴京的冷峰在这部里即像成龙又像杰森斯坦森但体制外的同类型电影主角总是代表个人无能的政府需要求助于这些英雄才能解决难题体现的是个人的价值所以主旋律照抄这种模式实际上是有问题的我们以前嘲笑个人英雄主义却没想到捆绑爱国主义的全能战士更加难以下咽', '犯我中华者虽远必诛是有多无脑才信这句话', '这部戏让人看的热血沸腾对吴京路转粉最后的彩蛋让我们没有理由不期待下一部', '假嗨特别恶心的电影', '有几处情节设置过于尴尬彰显国家自豪感的部分稍显突兀', '就是一部爽片打戏挺燃但是故事一般达康书记不合适这个角色赵东来倒是很合适张瀚太太太违和了分分钟穿越回偶像剧', '赵东来达康书记我们接到在非洲卧底的冷锋报告丁义珍现在非洲我们请求抓捕李达康东来这件事先不要声张特别是别让省厅知道就你和我一起去非洲加上冷锋同志三人逮捕丁义珍这次行就叫战狼2吧', '下一部拍喜剧吧整个片子真感觉挺搞笑的', '战狼2里吴京这么能打他打得过徐晓冬么', '心往一处想劲往一处使就能实现我们的梦想看吧比第一部好太多了谢谢美队的动作指导', '这都能火是我没见识', '开头的水下长对决戏可算华语电影的顶尖存在驱逐舰导弹和坦克在商业片里这么狂用也是了得镜头运用和笑点插入都很好莱坞爆米花不功不过从头打到尾是真拼虽然镜头也有略乱时因为没啥期望值所以被吴京的野心吓了一跳吴刚于谦和丁海峰老三位像炖烂熟的牛筋嚼着就舒服', '很用心啊吴京导演小看你了确实在导演上下功夫了拉片子了知道借鉴是好的至于大家比较反感的小粉红情绪我觉得那些桥段都是主旋律必备啊是稍微有一点过但还可以接受最好的地方是吴京节奏掌握得很好知道张弛有度这点很难得', '犯我中华者虽远必诛这句话一直在我脑子里回响', '片头海里那场动作戏看完就呆不下去了太假太做作提前离场', '好看这部戏让人看的热血沸腾打戏挺燃的吴京演技棒呆了', '符合有钱了续集反而拍更差这一放之四海而皆准的规律场面越做越大然而伴随着各种动作场面和特效场面的升级这一部的叙事反而变得非常凌乱格局颇大想拍成黑鹰坠落结果撑死最多也只是官方主旋律版的敢死队吴京确实有野心但论自我角色定位能力远不如同是动作演员出身的甄子丹', '说喜欢这部片子的人不是装傻就是真傻要不是真的没有别的可看肯定是不会选这部的直男癌到令人发指所有剧情走向也完全是九十年代那套照搬审美这件事儿真不是一时半会儿能培养出来的', '整部电影延续1的风格热血场面比1来的要大打戏动作不错吴京挺适合演军人的电影之前的中国梦片段他都念的劲儿劲儿的整体来说还不错不过张翰太违和了一出来就一股雷阵雨的画风', '目瞪狗呆太瘠薄好看了中国人牛b就是硬道理隔壁建军大爷都没你们爱国', '战狼2的动作场景和战斗装备全线升级热血的打斗动作从头打到尾战狼2游走在电影审查红线的边界和政治安全的缝隙是部延续了第一部极具煽动爱国情绪的国产动作大片如此制作精良的影片还请多来一点', '电影用的胶卷挺差的故事过度也差地方部队还没太多展示就死去反正各种问题但就是能吸引人看下去就冲这为什么要这么鄙视敢想敢去开拓的人不允许他们再去拍直到能有更好的人拍出更棒的更出彩的电影来呢', '火爆的场面拍出了好莱坞大片的感觉本片必将燃爆暑期', '吴京厉害了身为武打演员能拍到这么高标准的大场面的枪战戏为你点赞热血男儿荷尔蒙爆发', '能给0星么好恶心啊', 
'血战钢锯岭中国人也会觉得好看因为它歌颂的宗教情怀是超越政权的但当你只想歌颂一个政权时很明显就低了一个层次甚至充满了现实乃至投机的考量高下立见', '请问吴京脑残弹簧床能挡火箭炮吗', '上一部是傲气雄鹰这一部是第一滴血4吴京算是国内导演对类型片感觉比较准的作为动作片钱都花在有效地方整体火爆流畅有大片气魄创作上也足够真诚人物设计也都不错连张翰都很可爱了如果吴京不像当年甄子丹那样一时膨胀在银幕上独占聚光灯肯定可以走得更远', '扪心自问这种电影真没法评价全片靠动作戏撑完文戏都是扯淡女主角毫无存在的必要故事不需要逻辑只要主角开挂但牛逼之处在于全片都透露着极强烈的爱国主义光环和意识形态枷锁在祖国面前一切反动派都是纸老虎所以战狼一个人开挂团灭一个连都是合情合理的动作戏还不错挺用心两星鼓励', '扪心自问这种电影真没法评价全片靠动作戏撑完文戏都是扯淡女主角毫无存在的必要故事不需要逻辑只要主角开挂但牛逼之处在于全片都透露着极强烈的爱国主义光环和意识形态枷锁在祖国面前一切反动派都是纸老虎所以战狼一个人开挂团灭一个连都是合情合理的动作戏还不错挺用心两星鼓励', '两星给打戏其他一般般没啥看点还有点尴尬', '太尴尬了手接炸弹哈哈哈从张翰出来之后我就想炸了他', '翻了一下我给第一部的评价是四星当时觉得挺燃的这部其实在完成度上更接近好莱坞的制作了每个步骤每个人物的走向都很顺滑没有任何出人意料的地方只给三星是因为看看最近现实世界的一切抱歉我在影院里燃不起来只是觉得一切都很魔幻当然开头的强拆是最有现实感的一幕了', '太喜欢战狼2开场6分钟长镜头的水下搏斗戏了从来没有在其它任何一部电影里看到过因为拍摄难度真的不一般同时还对演员有各种技能方面的要求看完片子回来搜了下被吴京会游泳潜水滑雪开飞机开坦克射击等各项技能还特意去特种部队当过18个月兵真的很佩服这样的电影人', '3星半1电影结束有掌声出现近期少见2一粒爱国主义大补丸有人吃的开心有人觉得补大了3从头打到尾从白打到黑4从片头字幕到影片细节完全展现了吴京作为一个超级直男的糙和猛主角光环媲美终结者5达康书记无亮点张翰变谐星63D7导演的掌控能力逼近Hold不住的边缘', '打戏非常带感燃爆了拳拳到肉看得超爽', '吴京确实很聪明很鸡贼在一面大旗下呈现了一出重工业娱乐电影他一直调控着说教和娱乐的比例娱乐多了尺度不被允许说教多了大众不接纳比例把握非常微妙这其中还是有一些奇侠化的内容比如用玻璃碴子当飞镖杀敌一类只不过被遮盖掉了老爹演过美剧搏击王国力荐那部美剧', '作为主旋律影片为啥用奇异恩典配乐画内镜头还是中国军人', '男生看这部电影的话应该会很喜欢因为很刺激肾上腺素如果是女生冷锋对龙小云的感情也会十分打动你真的', '无脑动作片模仿许多好莱坞大场面再想怎么玩怎么玩一股脑堆槽点多到炸几位主角血厚到科幻级别吴京重复演满血红血中毒极速回血爆种打通全场确实很拼但片子太过投机取巧炸穿银幕连迈克尔贝都不受待见了国片还前仆后继炸不停故事不好看堆再多大场面大爆炸假high瞎燃也没用510', '吴京这种女人就缺我这样的男人征服心往一处想劲往一处使就能实现吴京直男癌的中国梦', '美国大片就能意淫国产的就不行美国的就打不死全都跳飞机跟跳墙一样中国就不行', '好莱坞总是美国总是拯救世界国产片就是中国梦想拯救非洲', '以现在的中印局势来对比这部电影假想的内容还真是挺讽刺的哈哈哈', '谄媚投机到恶心', '作为军旅题材给四星我觉得不过分质感燃到爆炸', '燃大场面真的不输国外大片不尴尬吴京打戏很精彩水下搏斗看着也很有力必须安利一下张翰这角色简直就是个彩蛋啊承包所有笑点为他量身定做的哈哈哈彭于晏可演不来是真的好看', '战狼2的制作明显比第一部升级了不少坦克漂移无人机突袭直升机坠露水下肉搏军舰导弹发射场面和动作再加上非洲叛乱国际化的视角完全是好莱坞大片的标配吴京饰演的冷锋更加深入人心如此搏命的精神在当下华语动作电影算是少见了期待第三部', '好燃啊好看表白吴京和达康书记', '燃大场面真的不输国外大片不尴尬吴京打戏很精彩水下搏斗看着也很有力必须安利一下张翰这角色简直就是个彩蛋啊承包所有笑点为他量身定做的哈哈哈彭于晏可演不来是真的好看', '战狼2的制作明显比第一部升级了不少坦克漂移无人机突袭直升机坠露水下肉搏军舰导弹发射场面和动作再加上非洲叛乱国际化的视角完全是好莱坞大片的标配吴京饰演的冷锋更加深入人心如此搏命的精神在当下华语动作电影算是少见了期待第三部', '好燃啊好看表白吴京和达康书记', '典型美国大片的方式每次都能猜对剧情没劲诶我就想问王牌特工就那么点杀人的镜头还经过艺术处理都直接删了战狼2这种血腥屠杀的镜头赤裸裸的大段大段的是怎么过的政治正确就有庇衣了', 
'意料之中的精彩意料之外的惊喜属于我们的英雄展现狼性的军魂', '几个网红拉出来弹弹琴你们就说燃了彰显我大国气象荷尔蒙满屏这TM才叫燃这部电影告诉我们中国人也是可以拯救世界的', '吴迪塞尔如入无人之境7亿大陆直男在这一刻集体勃起心往一处想劲往一处使你就能离开影厅不是个屌丝了', '同样是主旋律片子比电影开始前的我的中国梦要屌一万倍', '吴京这一次完全就是用超级英雄的标准来打造角色美式英雄主义与主旋律的违和是不可逆转的缺点各种笑料也一定程度地破坏了节奏感斥巨资炮制的大场面有所体验动作场面的流畅自然也比得上好莱坞水准但满到溢出却又影响了观感有着明显的优缺点但却会是受一般观众喜爱的院线电影两星半', '3d扣分', '和第一部同样精彩看完之后我热血澎拜啊', '纯粹拍的很难看啊', '集体癔症', '这个系列从1开始就跟吃了壮阳药似的', '张翰脸比女主白太多了请问用了什么护肤品', '客观的说七分虽然情节逻辑有各种经不起推敲的细节但总体完成度很高虽然反派有点无脑脸谱化但配角形象还算丰满尤其张瀚的富二代形象居然不招人讨厌有笑点有泪点是部用心的片子瑕不掩瑜值得鼓励', '在吴京的个人英雄幻想下连主旋律都沦为附庸了我tm还能说什么主角已经牛逼到突破逻辑的地步全天下超级无敌牛逼就是你了好啦好啦我都知道', '1看看人家装逼装得多专业2由于全国发布高温警报主角只好去非洲避暑了3Tundu我不要去中国中国太他妈热了我会被晒黑的4张翰在本片饰演亦凡整个太平洋都是他承包的鱼塘5达康书记东来局长都是幌子实际上反派是美队对冷锋的考验下集他将加入复联一起打灭霸', '三星半与首部一脉相承但脱离了军旅题材的限制变成了孤胆英雄动作片与第一滴血系列是一样的故事不新鲜但场面更大动作部分在技巧火爆之间切换整体非常燃片长有些长能看出拍摄时受限颇多有的镜头一看就是硬性指标但这样的片对拓展华语类型片有好处还是多多益善', '已三刷不明白为什么会有人说中二和吴京意淫这种类型的电影肯定要有一个英雄人物带动情节的发展真的好看全场无尿点谁会希望有战乱一些人或群体为了自己的私人利益发动战争但这苦的可是手无寸铁之力的民众祈求一个永远也实现不了的愿望世界和平黑子跪久了都站不起来了5分力荐', '张翰脸比女主白太多了请问用了什么护肤品', '客观的说七分虽然情节逻辑有各种经不起推敲的细节但总体完成度很高虽然反派有点无脑脸谱化但配角形象还算丰满尤其张瀚的富二代形象居然不招人讨厌有笑点有泪点是部用心的片子瑕不掩瑜值得鼓励', '在吴京的个人英雄幻想下连主旋律都沦为附庸了我tm还能说什么主角已经牛逼到突破逻辑的地步全天下超级无敌牛逼就是你了好啦好啦我都知道', '1看看人家装逼装得多专业2由于全国发布高温警报主角只好去非洲避暑了3Tundu我不要去中国中国太他妈热了我会被晒黑的4张翰在本片饰演亦凡整个太平洋都是他承包的鱼塘5达康书记东来局长都是幌子实际上反派是美队对冷锋的考验下集他将加入复联一起打灭霸', '三星半与首部一脉相承但脱离了军旅题材的限制变成了孤胆英雄动作片与第一滴血系列是一样的故事不新鲜但场面更大动作部分在技巧火爆之间切换整体非常燃片长有些长能看出拍摄时受限颇多有的镜头一看就是硬性指标但这样的片对拓展华语类型片有好处还是多多益善', '已三刷不明白为什么会有人说中二和吴京意淫这种类型的电影肯定要有一个英雄人物带动情节的发展真的好看全场无尿点谁会希望有战乱一些人或群体为了自己的私人利益发动战争但这苦的可是手无寸铁之力的民众祈求一个永远也实现不了的愿望世界和平黑子跪久了都站不起来了5分力荐', '样板戏走向全球', '捧高美国队长贬低战狼双重标准不要玩的太溜好莱坞玩爱美国就是高大上国内玩爱国就是假大空真不懂你们这些没有膝盖的人', '二十年前当我还是个懵懂的小孩子看到这样的故事和镜头我会被感动的哭鼻子如今二十年过去了你却还拿这样的故事逻辑和镜头给我看我只能尴尬的笑', '战狼2可以说用军舰坦克撞开了国产电影重工业的大门让观众了解到国产电影也可以像好莱坞大片一样可以有自己的超级英雄', '说剧情有bug我承认但你看的燃不燃high不high动作打得过瘾不过瘾看得心潮澎湃没有激动不国产商业动作片能拍到这个水准已经值得表扬了不给鼓励还在那挑刺呵呵', '这一星给开场拆房子的那场戏太不符合社会主义核心价值观了', '电影院一个人就买了四张票不打折请全家看心疼我的小钱钱还好家长们的反应是好的把吴京猛夸了一通中国式的动作大片并不逊色好莱坞只要他们满意我也就心满意足了我珍惜一家人和和气气的团聚时刻卢靖姗真漂亮干练明朗的健康美吴京好好拍第三部继续约啊', 
'听说过夜郎自大吗我第一次知道中国人这么牛逼中国部队所向披靡连特么坦克都能给你开漂移整部电影都是吴京一个人在意淫意淫真可怕中国维和单凭一个视频能越过联合国长驱直入到别国领土进行作战你当Africa是你家后院么意淫强国我身为天朝子民相当之荣幸扬我国威震我中华', '我跟你们讲吴京当年在电视上大吹牛逼说自己在杀破狼里和甄子丹真打多厉害多牛逼结果花絮里明明白白地展现了甄如何设计全套动作从头到尾手把手地教吴京', '中國狼不咬中國人www', '冷锋像是一个符号他代表着千千万万守护我们的军人国强则民安生活在这个没有战争的国度我们真的算是非常幸福每一个军人都值得我们尊敬这个强大的祖国也值得我们热爱如同影片的主旨中国护照不能带你去任何一个国家但能从任何一个地方把你平安带回家', '非常难看也不知道导演哪儿来的自信', '一部政治宣传片中国也就在非洲还有点脸面了给宏大的战争场面和吴京卖力的打斗五星剧情减一星政治倾向剪一星综合三星', '诚意满满全程无尿点吴京非常帅剧情比战狼1好看多了', '23点45分开始1点48分结束我看了12次手机影厅的天花板上有8条线组合成三角形装饰18个音响在棚顶前12个并列排放后面6个33一组一共24个探照灯第12个旁边有个摄像头另外中国护照上没有那句话', '有一场坦克戏简直令人浮想连连啊吴京真的不是故意的嘛又是压人压成肉泥的镜头又是吴京站在坦克正前面稍有常识的人都会看出如果敌方的铁骑继续前进然而吴京这个螳臂当车的歹徒真的阻挡住了']\n"
     ]
    }
   ],
   "source": [
    "print(articles_clean[:100])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "261497"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(articles_clean)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "output = 'data/articles_douban.txt'\n",
    "with open(output, 'w') as f:\n",
    "    for a in articles_clean:\n",
    "        f.write(a + '\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [],
   "source": [
    "def seg_atrticles(string):\n",
    "    return list(jieba.cut(string))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [],
   "source": [
    "# output_seg=\"../../data/assignment01/articles_douban_cut.txt\"\n",
    "output_seg=\"articles_douban_cut.txt\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Segment the whole cleaned file at once; the newline characters are kept as tokens\n",
    "with open(output) as f:\n",
    "    words_list = seg_atrticles(f.read())\n",
    "\n",
    "def seg2txt(articles_list, output):\n",
    "    # Map each token's position to the token and write the dict to `output` as text\n",
    "    line_dict = {}\n",
    "    with open(output, 'w') as f:\n",
    "        for i, line in enumerate(articles_list):\n",
    "            line_dict[i] = line\n",
    "        f.write(str(line_dict))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "seg2txt(words_list, output_seg)"
   ]
  },
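  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [],
   "source": [
    "def seg2list(output):\n",
    "    # Segment the file line by line; stops after line index 10000\n",
    "    words_list = []  # local accumulator; without it `+=` raises UnboundLocalError\n",
    "    with open(output) as f:\n",
    "        for i, line in enumerate(f):\n",
    "            if i % 100 == 0:\n",
    "                print(i)\n",
    "            if i > 10000:\n",
    "                break\n",
    "            words_list += seg_atrticles(line)\n",
    "    return words_list"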
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [],
   "source": [
    "from operator import add, mul"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [],
   "source": [
    "words_count = Counter(words_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('的', 328262),\n",
       " ('\\n', 261497),\n",
       " ('了', 102420),\n",
       " ('是', 73106),\n",
       " ('我', 50338),\n",
       " ('都', 36255),\n",
       " ('很', 34712),\n",
       " ('看', 34022),\n",
       " ('电影', 33675),\n",
       " ('也', 32065),\n",
       " ('和', 31290),\n",
       " ('在', 31245),\n",
       " ('不', 28435),\n",
       " ('有', 27939),\n",
       " ('就', 25685),\n",
       " ('人', 23909),\n",
       " ('好', 22858),\n",
       " ('啊', 20803),\n",
       " ('这', 17484),\n",
       " ('还', 17449),\n",
       " ('一个', 17343),\n",
       " ('你', 17282),\n",
       " ('还是', 16425),\n",
       " ('但', 15578),\n",
       " ('故事', 15010),\n",
       " ('没有', 14343),\n",
       " ('就是', 14007),\n",
       " ('喜欢', 13566),\n",
       " ('让', 13304),\n",
       " ('太', 12676),\n",
       " ('又', 11566),\n",
       " ('剧情', 11359),\n",
       " ('没', 10858),\n",
       " ('说', 10764),\n",
       " ('吧', 10747),\n",
       " ('他', 10675),\n",
       " ('不错', 10416),\n",
       " ('得', 10349),\n",
       " ('到', 10341),\n",
       " ('给', 10300),\n",
       " ('这个', 10058),\n",
       " ('上', 10054),\n",
       " ('被', 9939),\n",
       " ('对', 9824),\n",
       " ('最后', 9694),\n",
       " ('一部', 9693),\n",
       " ('片子', 9590),\n",
       " ('什么', 9571),\n",
       " ('能', 9532),\n",
       " ('与', 9168),\n",
       " ('多', 8977),\n",
       " ('可以', 8972),\n",
       " ('不是', 8811),\n",
       " ('最', 8669),\n",
       " ('觉得', 8626),\n",
       " ('中', 8446),\n",
       " ('导演', 8390),\n",
       " ('自己', 8354),\n",
       " ('拍', 8172),\n",
       " ('好看', 8085),\n",
       " ('要', 8081),\n",
       " ('真的', 7908),\n",
       " ('感觉', 7828),\n",
       " ('但是', 7723),\n",
       " ('里', 7655),\n",
       " ('那', 7503),\n",
       " ('有点', 7479),\n",
       " ('想', 7442),\n",
       " ('这部', 7433),\n",
       " ('会', 7429),\n",
       " ('去', 7295),\n",
       " ('把', 7151),\n",
       " ('着', 7058),\n",
       " ('这么', 6784),\n",
       " ('小', 6626),\n",
       " ('个', 6546),\n",
       " ('而', 6507),\n",
       " ('这样', 6471),\n",
       " ('真是', 6449),\n",
       " ('那么', 6431),\n",
       " ('这种', 6377),\n",
       " ('片', 6333),\n",
       " ('不过', 6292),\n",
       " ('挺', 6244),\n",
       " ('时候', 6216),\n",
       " ('更', 6143),\n",
       " ('比', 6094),\n",
       " ('却', 5990),\n",
       " ('爱', 5909),\n",
       " ('我们', 5875),\n",
       " ('大', 5773),\n",
       " ('像', 5704),\n",
       " ('虽然', 5633),\n",
       " ('演技', 5631),\n",
       " ('其实', 5573),\n",
       " ('看到', 5450),\n",
       " ('知道', 5384),\n",
       " ('再', 5352),\n",
       " ('演员', 5328),\n",
       " ('那个', 5123)]"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "words_count.most_common(100)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4751810"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(words_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pylab as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 一个单词出现的概率\n",
    "def prob_1(word):\n",
    "    return words_count[word] / len(words_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0012363709828465365"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prob_1(\"我们\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 两个词两两组合\n",
    "token_2gram = [''.join(words_list[i:i+2]) for i in range(len(words_list[:-2]))]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['吴京意淫', '意淫到', '到了', '了脑残', '脑残的', '的地步', '地步看', '看了', '了恶心', '恶心想']"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "token_2gram[:10]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [],
   "source": [
    "word_count2 = Counter(token_2gram)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prob_2(word1, word2):\n",
    "    if word1 + word2 in word_count2:\n",
    "        return word_count2[word1+word2] / len(token_2gram)\n",
    "    else:\n",
    "        return 1 / len(token_2gram)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.0202836478241545e-05"
      ]
     },
     "execution_count": 118,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prob_2('我们','在')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 137,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 获取一个句子的概率\n",
    "def get_sen_prob(sentence):\n",
    "    words_seg = seg_atrticles(sentence)\n",
    "    sentence_prob = 1\n",
    "    for i, word in enumerate(words_seg[:-1]):\n",
    "        next_ = words_seg[i+1]\n",
    "        prob = prob_2(word, next_)\n",
    "        sentence_prob *= prob\n",
    "    sentence_prob *= prob_1(words_seg[-1])\n",
    "    return sentence_prob\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 138,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.705511236850992e-44"
      ]
     },
     "execution_count": 138,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_sen_prob('小明今天抽奖抽到一台苹果手机')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 188,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sentence: 唐顿庄园!极还不错!评分2    with prob: 2.852266781508377e-53  \n",
      "sentence: 战狼。颇难看。评分2    with prob: 1.0986569279374638e-43  \n",
      "sentence: 唐顿庄园!极难看!评分2    with prob: 6.936245706502436e-50  \n",
      "sentence: 战狼。颇燃。评分4    with prob: 2.687230755520918e-44  \n",
      "sentence: 湄公河行动。十分燃。评分1    with prob: 3.118271392048038e-49  \n",
      "sentence: 湄公河行动!十分还不错!评分4    with prob: 3.720760660070827e-53  \n",
      "sentence: 唐顿庄园!很还不错。评分2    with prob: 2.852266781508377e-53  \n",
      "sentence: 战狼。十分还不错。评分3    with prob: 1.7409722817488258e-47  \n",
      "sentence: 战狼!很还不错!评分1    with prob: 1.904094468769333e-47  \n",
      "sentence: 唐顿庄园。十分燃。评分4    with prob: 6.786210441636326e-50  \n",
      "sentence: 唐顿庄园!很燃!评分4    with prob: 1.6123384533125515e-43  \n",
      "sentence: 哈利波特!极还不错。评分1    with prob: 4.0471656545403906e-52  \n",
      "sentence: 战狼!很燃!评分2    with prob: 2.610303389714332e-37  \n",
      "sentence: 湄公河行动!很难看。评分4    with prob: 7.510072888744198e-48  \n",
      "sentence: 战狼!很还不错!评分4    with prob: 1.1050212669131148e-47  \n",
      "sentence: 湄公河行动。颇还不错!评分3    with prob: 5.862096386887099e-53  \n",
      "sentence: 湄公河行动。颇难看。评分1    with prob: 3.118271392048038e-49  \n",
      "sentence: 唐顿庄园。极还不错!评分1    with prob: 2.4042568244794406e-53  \n",
      "sentence: 哈利波特。很燃!评分1    with prob: 4.676750380116892e-42  \n",
      "sentence: 唐顿庄园。十分还不错!评分1    with prob: 2.4042568244794406e-53  \n"
     ]
    }
   ],
   "source": [
    "for sen in [sen_generate(grammar_dict, \"opinion\") for i in range(20)]:\n",
    "#     print(sen)\n",
    "    print('sentence: {}    with prob: {}  '.format(sen, get_sen_prob(sen)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3 获得优质的语言\n",
    "当我们能够生成随机的语言并且能判断之后，我们就可以生成更加合理的语言了。请定义 generate_best 函数，该函数输入一个语法 + 语言模型，能够生成**n**个句子，并能选择一个最合理的句子: \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.1 优质影评选取 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 248,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 思路：输入语法--》 语法生成句子--》句子经过语言模型判断（设置阈值）\n",
    "#  num: 生成语句的数量\n",
    "def sen2prob_generate(grammar_dict, target= \"opinion\", num = 1000):\n",
    "    # 生成句子。元组保存（sen, prob)\n",
    "    sen2prob = [] \n",
    "    sen_tuple = ()\n",
    "    for sen in [sen_generate(grammar_dict,target) for i in range(num)]:\n",
    "        sen_tuple = sen, get_sen_prob(sen)\n",
    "        sen2prob.append(sen_tuple)\n",
    "    return sen2prob"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 249,
   "metadata": {},
   "outputs": [],
   "source": [
    "def generate_best(sen2prob, threshold=2.335202721189153e-48):\n",
    "    senSorted = sorted(sen2prob, key=lambda x:x[1])\n",
    "    for sen, prob in senSorted:\n",
    "        if prob>threshold:\n",
    "            print('sentence: {} is good sentence, with prob: {} '.format(sen, prob))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 250,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "opinion  : [['电影名称', '标点符号', '副词', '形容词', '标点符号', '评分', 'star']]\n",
      "电影名称  : [['湄公河行动'], ['战狼'], ['唐顿庄园'], ['哈利波特']]\n",
      "副词  : [['很'], ['颇'], ['极'], ['十分']]\n",
      "形容词  : [['燃'], ['还不错'], ['难看']]\n",
      "标点符号  : [['。'], ['!']]\n",
      "star  : [['1'], ['2'], ['3'], ['4']]\n"
     ]
    }
   ],
   "source": [
    "grammar_dict = grammar_process(opinion) # 选择生成opinion格式的影评语法\n",
    "sen2prob = sen2prob_generate(grammar_dict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 251,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sentence: 唐顿庄园!很难看。评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园。很难看。评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园!很难看。评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园。很难看。评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园。很难看。评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园!很难看!评分4 is good sentence, with prob: 2.8162773332790756e-48 \n",
      "sentence: 唐顿庄园。很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园!很难看!评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园!很难看!评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看!评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看!评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园!很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园!很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园!很难看!评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看。评分3 is good sentence, with prob: 4.4370736761049925e-48 \n",
      "sentence: 唐顿庄园。很难看!评分1 is good sentence, with prob: 4.85280985387476e-48 \n",
      "sentence: 唐顿庄园。很难看。评分1 is good sentence, with prob: 4.85280985387476e-48 \n",
      "sentence: 唐顿庄园。很难看!评分1 is good sentence, with prob: 4.85280985387476e-48 \n",
      "sentence: 唐顿庄园!很难看!评分2 is good sentence, with prob: 5.757083936397022e-48 \n",
      "sentence: 唐顿庄园!很难看!评分2 is good sentence, with prob: 5.757083936397022e-48 \n",
      "sentence: 唐顿庄园!很难看!评分2 is good sentence, with prob: 5.757083936397022e-48 \n",
      "sentence: 湄公河行动。很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动!很难看。评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动。很难看。评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动!很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动!很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动!很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动。很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 湄公河行动!很难看!评分4 is good sentence, with prob: 7.510072888744198e-48 \n",
      "sentence: 战狼。极还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!颇还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!颇还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。十分还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。颇还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!极还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。很还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!十分还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!十分还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!十分还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!颇还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。很还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!十分还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。极还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!很还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!颇还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。十分还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。很还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。极还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼。极还不错!评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 战狼!颇还不错。评分4 is good sentence, with prob: 1.1050212669131148e-47 \n",
      "sentence: 湄公河行动!很难看。评分3 is good sentence, with prob: 1.1832196469613308e-47 \n",
      "sentence: 湄公河行动。很难看。评分3 is good sentence, with prob: 1.1832196469613308e-47 \n",
      "sentence: 湄公河行动。很难看!评分3 is good sentence, with prob: 1.1832196469613308e-47 \n",
      "sentence: 湄公河行动!很难看!评分3 is good sentence, with prob: 1.1832196469613308e-47 \n",
      "sentence: 湄公河行动!很难看。评分3 is good sentence, with prob: 1.1832196469613308e-47 \n",
      "sentence: 湄公河行动。很难看。评分1 is good sentence, with prob: 1.2940826276999356e-47 \n",
      "sentence: 湄公河行动!很难看!评分1 is good sentence, with prob: 1.2940826276999356e-47 \n",
      "sentence: 湄公河行动。很难看!评分1 is good sentence, with prob: 1.2940826276999356e-47 \n",
      "sentence: 湄公河行动。很难看!评分1 is good sentence, with prob: 1.2940826276999356e-47 \n",
      "sentence: 湄公河行动!很难看!评分1 is good sentence, with prob: 1.2940826276999356e-47 \n",
      "sentence: 湄公河行动。很难看!评分2 is good sentence, with prob: 1.5352223830392052e-47 \n",
      "sentence: 湄公河行动!很难看!评分2 is good sentence, with prob: 1.5352223830392052e-47 \n",
      "sentence: 湄公河行动。很难看。评分2 is good sentence, with prob: 1.5352223830392052e-47 \n",
      "sentence: 战狼。很还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!很还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!十分还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!很还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。极还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。很还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。颇还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。颇还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!很还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!十分还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。很还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!颇还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。颇还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!颇还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。极还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!很还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!极还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!极还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!十分还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!极还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!极还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼!颇还不错!评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。颇还不错。评分3 is good sentence, with prob: 1.7409722817488258e-47 \n",
      "sentence: 战狼。极还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!颇还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!极还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼。极还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼。很还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼。极还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼。很还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!十分还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!很还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!很还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!极还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!十分还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!极还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!十分还不错。评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼。很还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!极还不错!评分1 is good sentence, with prob: 1.904094468769333e-47 \n",
      "sentence: 战狼!颇还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!十分还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。极还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。极还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!颇还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!十分还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!很还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!极还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。很还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。十分还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。颇还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!颇还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!很还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!极还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。很还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!颇还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!颇还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!极还不错。评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。颇还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼。十分还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 战狼!极还不错!评分2 is good sentence, with prob: 2.258904018417626e-47 \n",
      "sentence: 哈利波特!很难看。评分4 is good sentence, with prob: 4.740733511019775e-47 \n",
      "sentence: 哈利波特!很难看!评分4 is good sentence, with prob: 4.740733511019775e-47 \n",
      "sentence: 哈利波特!很难看!评分4 is good sentence, with prob: 4.740733511019775e-47 \n",
      "sentence: 哈利波特!很难看。评分4 is good sentence, with prob: 4.740733511019775e-47 \n",
      "sentence: 哈利波特!很难看。评分3 is good sentence, with prob: 7.4690740214434e-47 \n",
      "sentence: 哈利波特!很难看。评分3 is good sentence, with prob: 7.4690740214434e-47 \n",
      "sentence: 哈利波特。很难看!评分3 is good sentence, with prob: 7.4690740214434e-47 \n",
      "sentence: 哈利波特。很难看。评分3 is good sentence, with prob: 7.4690740214434e-47 \n",
      "sentence: 哈利波特。很难看。评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特!很难看!评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特!很难看。评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特!很难看!评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特。很难看!评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特。很难看。评分1 is good sentence, with prob: 8.168896587355843e-47 \n",
      "sentence: 哈利波特!很难看!评分2 is good sentence, with prob: 9.691091292934982e-47 \n",
      "sentence: 哈利波特。很难看。评分2 is good sentence, with prob: 9.691091292934982e-47 \n",
      "sentence: 战狼!颇燃。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼!极难看。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼!极难看。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼。极难看。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼。颇燃!评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼。极难看。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼。极难看。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼。极难看!评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼!颇燃。评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼!极难看!评分4 is good sentence, with prob: 2.687230755520918e-44 \n",
      "sentence: 战狼!颇燃。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼。极难看。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼。颇燃!评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼。极难看。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!颇燃!评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼。颇燃。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!颇燃!评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!颇燃!评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!极难看。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!极难看。评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!颇燃!评分3 is good sentence, with prob: 4.233759476045202e-44 \n",
      "sentence: 战狼!颇燃。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。颇燃!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。颇燃。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。极难看!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼!颇燃!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。极难看!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼!颇燃。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。极难看。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼!颇燃!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。颇燃。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼!极难看!评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。颇燃。评分1 is good sentence, with prob: 4.630445920907813e-44 \n",
      "sentence: 战狼。十分燃。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分燃。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分难看!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分燃。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分燃!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。颇难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分燃!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分燃!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!颇难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!颇难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。颇难看!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。颇难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分燃。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分燃!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。十分难看!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼。颇难看!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分难看。评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!十分燃!评分4 is good sentence, with prob: 5.374461511041836e-44 \n",
      "sentence: 战狼!颇燃。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼!颇燃!评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。颇燃。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。极难看。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。极难看!评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼!极难看。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼!极难看。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。极难看。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。极难看。评分2 is good sentence, with prob: 5.493284639687319e-44 \n",
      "sentence: 战狼。十分难看。评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分难看!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分难看!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼!十分燃!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分燃。评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼!十分燃!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼!十分难看。评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼!十分燃!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分燃!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼!颇难看!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。颇难看!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分难看!评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。十分燃。评分3 is good sentence, with prob: 8.467518952090404e-44 \n",
      "sentence: 战狼。颇难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!颇难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼。十分难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分燃。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!颇难看!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分燃!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分燃!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼。颇难看。评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼。颇难看!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼。十分燃!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分难看!评分1 is good sentence, with prob: 9.260891841815627e-44 \n",
      "sentence: 战狼!十分燃!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分燃!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!颇难看!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。颇难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分燃!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分燃。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。颇难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分难看!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分燃!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分燃!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!颇难看!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。颇难看!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分燃。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分燃。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼。十分难看!评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 战狼!十分难看。评分2 is good sentence, with prob: 1.0986569279374638e-43 \n",
      "sentence: 唐顿庄园。极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。很燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!很燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。很燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。很燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!极燃。评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园。极燃!评分4 is good sentence, with prob: 1.6123384533125515e-43 \n",
      "sentence: 唐顿庄园!很燃。评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园。很燃!评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园!很燃。评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园。很燃!评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园!很燃!评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园!极燃!评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园!很燃。评分3 is good sentence, with prob: 2.540255685627122e-43 \n",
      "sentence: 唐顿庄园!极燃!评分1 is good sentence, with prob: 2.778267552544689e-43 \n",
      "sentence: 唐顿庄园。很燃!评分1 is good sentence, with prob: 2.778267552544689e-43 \n",
      "sentence: 唐顿庄园!极燃!评分1 is good sentence, with prob: 2.778267552544689e-43 \n",
      "sentence: 唐顿庄园!很燃。评分1 is good sentence, with prob: 2.778267552544689e-43 \n",
      "sentence: 唐顿庄园!极燃。评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园。很燃!评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园!极燃!评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园。很燃。评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园。很燃!评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园。很燃。评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 唐顿庄园!极燃。评分2 is good sentence, with prob: 3.295970783812393e-43 \n",
      "sentence: 湄公河行动。极燃。评分4 is good sentence, with prob: 4.299569208833469e-43 \n",
      "sentence: 湄公河行动!极燃!评分4 is good sentence, with prob: 4.299569208833469e-43 \n",
      "sentence: 湄公河行动。很燃!评分4 is good sentence, with prob: 4.299569208833469e-43 \n",
      "sentence: 湄公河行动。很燃!评分4 is good sentence, with prob: 4.299569208833469e-43 \n",
      "sentence: 湄公河行动!很燃。评分4 is good sentence, with prob: 4.299569208833469e-43 \n",
      "sentence: 湄公河行动。极燃!评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动。很燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动。很燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动!极燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动。很燃!评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动!很燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动。极燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动!极燃。评分3 is good sentence, with prob: 6.774015161672323e-43 \n",
      "sentence: 湄公河行动!极燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。很燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。极燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。极燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!很燃!评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!极燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。很燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动!极燃。评分1 is good sentence, with prob: 7.408713473452501e-43 \n",
      "sentence: 湄公河行动。很燃。评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动。很燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!极燃。评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!极燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!极燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动。很燃。评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!很燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动。很燃。评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!极燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动。很燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动。极燃。评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!很燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 湄公河行动!极燃!评分2 is good sentence, with prob: 8.78925542349971e-43 \n",
      "sentence: 战狼。很难看。评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼。很难看。评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼!很难看。评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼!很难看!评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼。很难看。评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼。很难看!评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼。很难看!评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 战狼!很难看。评分4 is good sentence, with prob: 2.230401527082362e-42 \n",
      "sentence: 哈利波特。极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特!很燃。评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特!极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。很燃。评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。很燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特!很燃。评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特!极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。很燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。极燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。很燃!评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 哈利波特。很燃。评分4 is good sentence, with prob: 2.7141030630761275e-42 \n",
      "sentence: 战狼!很难看!评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼!很难看。评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼。很难看。评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼。很难看!评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼。很难看。评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼。很难看。评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼。很难看。评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼!很难看!评分3 is good sentence, with prob: 3.514020365117518e-42 \n",
      "sentence: 战狼!很难看。评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 战狼!很难看!评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 战狼!很难看!评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 战狼。很难看。评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 战狼!很难看!评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 战狼!很难看!评分1 is good sentence, with prob: 3.843270114353485e-42 \n",
      "sentence: 哈利波特。极燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!极燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。很燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。很燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。极燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!极燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!极燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。极燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特!很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。极燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。很燃!评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 哈利波特。极燃。评分3 is good sentence, with prob: 4.2760970708056545e-42 \n",
      "sentence: 战狼。很难看。评分2 is good sentence, with prob: 4.559426250940475e-42 \n",
      "sentence: 战狼!很难看!评分2 is good sentence, with prob: 4.559426250940475e-42 \n",
      "sentence: 战狼!很难看!评分2 is good sentence, with prob: 4.559426250940475e-42 \n",
      "sentence: 哈利波特!很燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。很燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃。评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。极燃。评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特。极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃。评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃。评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!极燃!评分1 is good sentence, with prob: 4.676750380116892e-42 \n",
      "sentence: 哈利波特!很燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。很燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。很燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。很燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。很燃。评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。极燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特!极燃。评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特!很燃!评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特。极燃。评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特!很燃。评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 哈利波特!极燃。评分2 is good sentence, with prob: 5.548217486084193e-42 \n",
      "sentence: 战狼。很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼!极燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。极燃。评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼!很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。极燃。评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼!极燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼!极燃!评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃。评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。很燃。评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼!极燃。评分4 is good sentence, with prob: 1.2769204601930343e-37 \n",
      "sentence: 战狼。极燃!评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。极燃。评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。很燃!评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼!极燃!评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。极燃!评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。很燃。评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼!极燃!评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼!很燃。评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。很燃。评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼!很燃。评分3 is good sentence, with prob: 2.0118012148347397e-37 \n",
      "sentence: 战狼。很燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。极燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃。评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。极燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼!极燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃。评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼!很燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼!极燃!评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼!极燃。评分1 is good sentence, with prob: 2.2002989970537116e-37 \n",
      "sentence: 战狼。很燃。评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼!极燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼。极燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼。很燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼!极燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼。很燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼!很燃。评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼!很燃。评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼。极燃。评分2 is good sentence, with prob: 2.610303389714332e-37 \n",
      "sentence: 战狼。很燃!评分2 is good sentence, with prob: 2.610303389714332e-37 \n"
     ]
    }
   ],
   "source": [
    "generate_best(sen2prob)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.2 Problems and Improvements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
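    "1. The sentences generated by the grammar are fairly uniform and lack diversity.\n",
    "2. Because a sentence's score is a product of conditional probabilities, every additional token multiplies in another factor no greater than 1. Other things being equal, shorter sentences therefore receive higher probabilities than longer ones.\n",
    "3. The results depend heavily on the corpus used to train the language model. Reliable scores require a corpus that is both large and relevant to the target domain."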
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
